Speaker Recognition of Noisy Short Utterance Based on Speech Frame Quality Discrimination and Three-stage Classification Model
Authors
Abstract
A noisy short utterance is corrupted by noise and provides very little speech data, so the speaker recognition rate drops significantly. To improve the recognition rate, we propose a dual-information quality discrimination algorithm for classifying speech frames. It consists of two parts: the differences detection and discrimination algorithm (DDADA) and the improved SNR discrimination algorithm (ISNRDA). Based on these two algorithms, speech frames are divided into three classes: high quality, medium quality, and low quality. We also propose a GMM-UBM three-stage classification model and combine it with the dual-information quality discrimination algorithm. Experiments show that the dual quality discrimination algorithms classify speech frames more precisely, and that combining them with the GMM-UBM three-stage classification model makes full use of the limited data in a short utterance and improves the speaker recognition rate for noisy short utterances.
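The abstract does not give the details of DDADA or ISNRDA, so the following is only a minimal sketch of the general idea of sorting speech frames into three quality classes by an estimated per-frame SNR. The function names, the frame sizes, the noise estimate taken from the quietest frames, and the two thresholds (5 dB and 15 dB) are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def frame_signal(x, frame_len=400, hop=160):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    if len(x) < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def frame_snr_db(frames, noise_fraction=0.1, eps=1e-12):
    """Rough per-frame SNR in dB (assumed estimator, not the paper's ISNRDA).

    Noise power is estimated as the mean power of the quietest frames.
    """
    power = np.mean(frames ** 2, axis=1)
    n_noise = max(1, int(noise_fraction * len(power)))
    noise_power = np.mean(np.sort(power)[:n_noise])
    # Subtract the noise estimate; floor at eps so the logarithm stays finite.
    signal_power = np.maximum(power - noise_power, eps)
    return 10.0 * np.log10(signal_power / (noise_power + eps))

def classify_frames(snr_db, low_thr=5.0, high_thr=15.0):
    """Label each frame 0 (low), 1 (medium) or 2 (high quality).

    The thresholds are hypothetical values chosen for this sketch.
    """
    return np.digitize(snr_db, [low_thr, high_thr])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = 0.01 * rng.standard_normal(16000)      # background noise only
    t = np.arange(8000) / 16000.0
    x[8000:] += np.sin(2 * np.pi * 200.0 * t)  # loud tone in the second half
    labels = classify_frames(frame_snr_db(frame_signal(x)))
    print(np.bincount(labels, minlength=3))    # frame count per quality class
```

In a three-stage scheme such as the one the abstract describes, the per-frame labels could then decide which frames are passed to which stage of the classifier; how the paper actually does this is not stated in the abstract.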
Similar resources
Recognizing the Emotional State Changes in Human Utterance by a Learning Statistical Method based on Gaussian Mixture Model
Speech is one of the most opulent and instant methods to express emotional characteristics of human beings, which conveys the cognitive and semantic concepts among humans. In this study, a statistical-based method for emotional recognition of speech signals is proposed, and a learning approach is introduced, which is based on the statistical model to classify internal feelings of the utterance....
Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods
Speech recognition is a subfield of artificial intelligence that develops technologies to convert speech utterance into transcription. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...
A New Method for Speech Enhancement Based on Incoherent Model Learning in Wavelet Transform Domain
Quality of speech signal significantly reduces in the presence of environmental noise signals and leads to the imperfect performance of hearing aid devices, automatic speech recognition systems, and mobile phones. In this paper, the single channel speech enhancement of the corrupted signals by the additive noise signals is considered. A dictionary-based algorithm is proposed to train the speech...
Signal Processing of Noisy Short Utterance Based on Noise Separation and Multiple Features Fusion
The recognition rate for noisy short utterances is low; the two main causes are inadequate training data and heavy noise corruption of the utterance. In this paper, we propose algorithms to address both. First, treating noise and speech as parallel information, we use the FastICA algorithm to separate clean speech from noise. Then, we use the differences detecting and eliminating algorithm (DDAEA) ...
Exemplar-based sparse representation and sparse discrimination for noise robust speaker identification
Probabilistic modeling is the most successful approach widely used in speaker recognition, either for modeling the speakers in a GMM-UBM structure or by serving as a prior in secondary-level feature extraction to form i-vectors. In this paper, we introduce exemplar-based sparse representation and sparse discrimination for closed-set speaker identification in a noisy living room from very short spee...
Journal title:
Volume, Issue
Pages -
Publication date 2015